Valence — Technical Specifications

The local engine

A full model runs on your machine out of the box — no account, no key, no internet.

Bundled modelGemma 4 E2B-it, served by an embedded llama-server sidecar built from llama.cpp.

AccelerationGPU-accelerated via CUDA; exposes a local OpenAI-compatible endpoint.

SetupAuto-starts on first launch. No API key, no daemon, no configuration.

ConnectivityFully offline. The local model never makes a network request.

Bring your own localDrop in your own downloaded models and run them on the same built-in engine — most popular open models work.1 Prefer Ollama? Point Valence at your install and use those too.

1 Local models use the GGUF format (the file format llama.cpp loads). Most open-source models are published in GGUF or can be converted to it. ↩

Providers & models

Bring your own key (BYOK) for any cloud provider — your keys are stored encrypted on your machine and calls go straight to the provider, never through Helix.

Provider	Representative models	Type
Local (bundled) Gemma 4 E2B-it · llama-server	The fastest path to a fully offline conversation.	Local
Your own local models most popular open models	Load models you've downloaded yourself and run them on the built-in engine — fully offline.	Local
Ollama your Ollama library	Already use Ollama? Connect it and use those models too — air-gapped.	Local
OpenAI GPT-4.1 · GPT-4o · o-series	Fast generalist coverage for writing, coding, multimodal.	BYOK
Anthropic Claude Opus 4 · Sonnet 4 · 3.5	Strong analysis, careful reasoning, review-heavy work.	BYOK
Google Gemini 2.5 / 3 Pro · Flash	Fast synthesis and broad coverage for research.	BYOK
xAI Grok 4	Frontier general model.	BYOK
Azure OpenAI enterprise tier	OpenAI models under your Azure tenancy.	BYOK
Vertex AI Google enterprise	Gemini under Google Cloud governance.	BYOK
OpenRouter 100+ aggregated	One key, the long tail of models.	BYOK

VisionVision-capable models analyze images you paste or drag into the chat.

Switching@-mention any configured model mid-thread; context carries across the switch.

Memory

Persistent, on-device recall — the model keeps useful context across sessions without anything leaving your disk.

Store a fact#remember{ } — e.g. #remember{user prefers Python}. Saved to local memory.

Recall#recall{ } searches your memory for relevant facts and feeds them back into the conversation.

StoragePlain JSON on your disk (Documents\AIOverlay\LocalMemory\) — per profile. Never transmitted.

Scales into HIVEWhen HIVE (Helix's document-memory platform) is connected, the same commands route to durable, searchable memory across documents, conversations and logs. HIVE is in active development.

Abilities & MCP tools

Give the model a real action layer — search the web, read your files, query databases — through the Model Context Protocol.

How it worksConfigure a stdio MCP server in Settings; Valence discovers its tools automatically and each becomes an ability the model can invoke, with results fed back into the thread.

Example servers

Filesystem — @modelcontextprotocol/server-filesystem
Brave Search — server-brave-search (web search)
Memory — server-memory
GitHub — issues, repos, and more

Built-in abilitiesTrigger systems and built-in commands (#remember, #recall) work the same way — invoked inline during conversation.

Files & vision

AttachmentsDrop in a CSV, PDF, or whole directory and the model reads them as context.

ImagesPaste or drag images into the chat for vision-capable models to analyze.

Workspace

TabsRun separate threads in parallel; provider selection is per tab, so workflows don't collide.

CompareSend the same prompt through two or three models and read the answers side by side.

ConversationsSave, load, search (by title or content), sort and delete. Import existing conversations from elsewhere.

Profiles & personas

Each profile carries its own model, system prompt, preferences and memory — and what one knows, the next can't see.

Per-profileCustom system prompts and preferences for code review, writing, research, family — swap posture without opening a new app.

Sealed memoryMemory is partitioned per profile (profiles/<name>/). Your work profile and your home profile never share context.

Storage & data

You can open, read and audit everything Valence stores — it's all on your machine.

Settings%AppData%\AIOverlayV2.1\settings.json

Saved chatsDocuments\AIOverlay (configurable) — stored as JSON with full message metadata.

API keysEncrypted with Windows DPAPI — they never leave your machine in plain text.

Telemetry to Helix0 bytes. Cloud calls go straight to the provider you chose; Helix is never in the path.

Kids Mode

Guardrail tiers5 tiers across ages 3–17 — reading level, topic limits and refusal rules locked to the profile.

Forced localKids profiles can only use the on-device model. The cloud picker is removed — no override.

Parent dashboardReview every conversation, enforce daily session limits, see flagged topics. Nothing is hidden.

Keyboard shortcuts

New chatCtrl + L — clear / start a new conversation

Copy threadCtrl + Shift + C — copy entire conversation

FindCtrl + F — search within the current conversation

Licensing

Price$59.95 once. No tiers, no feature gating, no “pro” tax — the entire product surface.

Trial7 days · no card · no account.

Running costThe local model costs nothing per chat. Cloud usage bills to your own provider account, never to Helix.

PlatformWindows desktop application. Mac & iOS planned.

Technical specifications.